Khmer Spell Checker

Author

  • Puthick Hok
Abstract

Khmer is the official language of Cambodia. Like Chinese, Japanese, and Thai, Khmer is written without spaces or other word delimiters. This is a major challenge for spell checking Khmer, since there is no simple way to determine word boundaries. Spell checking Khmer is nevertheless feasible, although the process differs from that used for languages with word delimiters, such as English. In Khmer, words are constructed from root words that are made up of consonantal clusters, and these clusters can be misspelled. To spell check a text, we first find the approximate clusters for each input cluster. We then give the possible sequences of consonantal clusters to a hidden Markov model, which assigns a score to every sequence. Based on the possible sequences and their scores, we determine the word boundaries, whether or not each word is correctly spelled, and some alternative words if it is misspelled.
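The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not specify the model's parameters, so the function name `best_sequence`, the Latin placeholder clusters, and all probabilities below are hypothetical. The sketch assumes that each input cluster has already been expanded into a short list of approximate candidates, each with an emission log-probability, and it uses a standard Viterbi search to pick the highest-scoring sequence. Word-boundary detection is left out.

```python
import math

def best_sequence(lattice, start_logp, trans_logp, floor=-20.0):
    """Viterbi search over a lattice of candidate clusters.

    lattice:    one list per input position; each entry is a
                (candidate_cluster, emission_log_prob) pair.
    start_logp: log-probability of a cluster starting a sequence.
    trans_logp: log-probability of (previous, current) cluster pairs.
    floor:      log-probability assumed for unseen events (an assumption).
    Returns (best_log_score, best_cluster_path).
    """
    # best[c] = (score, path) of the best path ending in cluster c so far
    best = {
        cluster: (start_logp.get(cluster, floor) + emit, [cluster])
        for cluster, emit in lattice[0]
    }
    for candidates in lattice[1:]:
        nxt = {}
        for cluster, emit in candidates:
            # Pick the best predecessor for this candidate cluster.
            score, path = max(
                (
                    (p_score + trans_logp.get((prev, cluster), floor) + emit, p_path)
                    for prev, (p_score, p_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            nxt[cluster] = (score, path + [cluster])
        best = nxt
    return max(best.values(), key=lambda item: item[0])

# Toy data: placeholder clusters and made-up probabilities.
lattice = [
    [("ka", math.log(0.9)), ("kha", math.log(0.1))],
    [("ra", math.log(0.6)), ("la", math.log(0.4))],
]
start_logp = {"ka": math.log(0.5), "kha": math.log(0.5)}
trans_logp = {
    ("ka", "la"): math.log(0.8),
    ("ka", "ra"): math.log(0.1),
    ("kha", "ra"): math.log(0.9),
}

score, path = best_sequence(lattice, start_logp, trans_logp)
print(path, math.exp(score))
```

In this toy lattice the transition model outweighs the locally preferred candidate at the second position. A full system would compare the chosen sequence against the input clusters to flag misspellings and would keep the runner-up paths as suggestions.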


Similar resources

Sharif Text Editor: A System for Editing and Spell Checking Persian

In this paper, we will introduce an intelligent system to edit and spell check Persian texts. The goal is editing and preprocessing Persian texts for natural language processing tasks. This system is based on an expandable and engineering approach and is composed of three subsystems: Persian text editor, spell checker and stemmer. These parts interact with each other to edit texts. To do this, ...


A Survey of Spelling Error Detection and Correction Techniques

Spelling correction is the process of detecting, and sometimes providing suggestions for, incorrectly spelled words in a text. A spell checker is an application program that flags words in a document that may not be spelled correctly. A spell checker may be stand-alone, capable of operating on a block of text, or part of a larger application such as a word processor or electronic dictionary. When some text is given as an input to spell chec...


Design and Implementation of Punjabi Spell Checker

Spellcheckers are the basic tools needed for word processing and document preparation. Designing a spell checker for Indian languages such as Punjabi poses many new challenges not found in English, which complicates the design of the spell checker. Punjabi language is far different from Western languages in phonetic properties and grammatical rules. Thus the existing algorithms and techniques t...


Context Sensitive Query Correction Method for Query-Based Text Summarization

Contextual spell correction is very important for real-word error correction. It gives the correct word for an incorrect word in a particular sentence. A traditional spell checker can correct misspelled words that are not present in the dictionary, but here we try to develop a spell checker that can give the appropriate word on the basis of the contextual meaning of the sentence. This spell ch...


Improving Finite-State Spell-Checker Suggestions with Part of Speech N-Grams

We demonstrate a finite-state implementation of context-aware spell checking utilizing an N-gram based part of speech (POS) tagger to rerank the suggestions from a simple edit-distance based spell-checker. We demonstrate the benefits of context-aware spellchecking for English and Finnish and introduce modifications that are necessary to make traditional N-gram models work for morphologically mo...




Publication date: 2005